Abstract

In this paper, we first discuss computational experience using the SR1 update
in conventional line search and trust region algorithms for unconstrained
optimization. Our experiments show that the SR1 is very competitive with
the widely used BFGS method. They also indicate two interesting features:
the final Hessian approximations produced by the SR1 method are not generally
appreciably better than those produced by the BFGS, and the sequences
of steps produced by the SR1 do not usually seem to have the "uniform linear
independence" property that is assumed in some recent convergence analysis.
We present a new analysis that shows that the SR1 method with a line search
is (n+1)-step q-superlinearly convergent without the assumption of linearly
independent iterates. This analysis assumes that the Hessian approximations
are positive definite and bounded asymptotically, which from our computational
experience are reasonable assumptions.
1. Introduction


This paper is concerned with secant (quasi-Newton) methods for finding a local minimum
of the unconstrained optimization problem

$$\min_{x \in \mathbb{R}^n} f(x). \qquad (1.1)$$

It will be assumed that $f(x)$ is at least twice continuously differentiable, and that the
number of variables $n$ is sufficiently small to permit storage of an $n \times n$ matrix, and
$O(n^2)$ or possibly $O(n^3)$ arithmetic operations per iteration.

Algorithms  for  solving  (1.1)  are  iterative,  and  the  basic  framework  of  an  iteration  of 
a  secant  method  is: 


Given current iterate $x_c$, $f(x_c)$, $\nabla f(x_c)$ or finite difference approximation, and
$B_c \in \mathbb{R}^{n \times n}$ symmetric (secant approximation to $\nabla^2 f(x_c)$):

Select new iterate $x_+$ by a line search or trust region method based on quadratic
model $m(x_c + d) = f(x_c) + \nabla f(x_c)^T d + \frac{1}{2} d^T B_c d$.

Update $B_c$ to $B_+$ such that $B_+$ is symmetric and satisfies the secant equation
$B_+ s_c = y_c$, where $s_c = x_+ - x_c$ and $y_c = \nabla f(x_+) - \nabla f(x_c)$.


In this paper, we consider the SR1 update for the Hessian approximation,

$$B_+ = B_c + \frac{(y_c - B_c s_c)(y_c - B_c s_c)^T}{s_c^T (y_c - B_c s_c)} \qquad (1.2)$$

and, for purpose of comparison, the BFGS update

$$B_+ = B_c + \frac{y_c y_c^T}{y_c^T s_c} - \frac{B_c s_c s_c^T B_c}{s_c^T B_c s_c}. \qquad (1.3)$$



For background on these updates and others see Fletcher [1980], Gill, Murray, and
Wright [1981], and Dennis and Schnabel [1983].
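
As an illustration, the two updates can be written in a few lines of Python. This is a minimal sketch and not code from the paper: the function names `sr1_update` and `bfgs_update` and the list-of-lists matrix representation are assumptions made for the example, and no safeguard against small denominators is included.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def dot(u: Vector, v: Vector) -> float:
    return sum(a * b for a, b in zip(u, v))


def matvec(B: Matrix, v: Vector) -> Vector:
    return [dot(row, v) for row in B]


def sr1_update(B: Matrix, s: Vector, y: Vector) -> Matrix:
    """Rank-one update (1.2); assumes s^T (y - B s) is not (nearly) zero."""
    r = [yi - bsi for yi, bsi in zip(y, matvec(B, s))]
    denom = dot(s, r)
    n = len(s)
    return [[B[i][j] + r[i] * r[j] / denom for j in range(n)] for i in range(n)]


def bfgs_update(B: Matrix, s: Vector, y: Vector) -> Matrix:
    """Rank-two update (1.3); assumes B is symmetric and s^T y > 0."""
    Bs = matvec(B, s)
    ys = dot(y, s)
    sBs = dot(s, Bs)
    n = len(s)
    return [[B[i][j] + y[i] * y[j] / ys - Bs[i] * Bs[j] / sBs
             for j in range(n)] for i in range(n)]
```

Both functions return a symmetric matrix that satisfies the secant equation $B_+ s_c = y_c$ whenever their stated assumptions hold.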

The BFGS update has been the most commonly used secant update for many years.
It makes a symmetric, rank two change to the previous Hessian approximation $B_c$, and
if $B_c$ is positive definite and $s_c^T y_c > 0$, then $B_+$ is positive definite.

The BFGS method has been shown by Broyden, Dennis, and Moré [1973] to be locally
q-superlinearly convergent provided that the initial Hessian approximation is sufficiently
accurate. Powell [1976] proved a global superlinear convergence result for the BFGS
method when applied to strictly convex functions and used in conjunction with line
searches that satisfy the conditions of Wolfe [1968]. The BFGS update has been used
successfully in many production codes for unconstrained optimization.

The SR1 formula, on the other hand, makes a symmetric rank one change to the
previous Hessian approximation $B_c$. Compared with other secant updates, the SR1
update is simpler and may require less computation per iteration when unfactored forms
of updates are used. (Factored updates are those in which a decomposition of $B_c$ is
updated at each iteration.) A basic disadvantage to the SR1 update, however, is the
fact that its denominator may be zero or nearly zero, which causes numerical instability.
A simple remedy to this problem is to set $B_+ = B_c$ whenever this difficulty arises, but
this may prevent fast convergence. Another problem is that the SR1 update may not
preserve positive definiteness even if this is possible, i.e., when $B_c$ is positive definite and
$s_c^T y_c > 0$.

Fiacco and McCormick [1968] showed that if the SR1 update is applied to a positive
definite quadratic function in a line search method, then, provided that the updates are
all well defined, the solution is reached in at most $n + 1$ iterations. Furthermore, if $n + 1$
iterations are required, then the final Hessian approximation is the actual Hessian at the
solution. This result is not true, in general, for the BFGS update or other members of
the Broyden family, unless exact line searches are used.
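
The Hessian-recovery property can be checked by hand on a small quadratic. The sketch below is illustrative only: the matrix `A`, the two coordinate steps, and the helper name `sr1_update` are assumptions chosen for the example, and the steps are fixed in advance rather than generated by a line search. For a quadratic with Hessian $A$ every step satisfies $y = As$, and after $n = 2$ linearly independent steps with well-defined updates the SR1 matrix equals $A$.

```python
def sr1_update(B, s, y):
    """SR1 update B + r r^T / (s^T r) with r = y - B s (no safeguards)."""
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    r = [y[i] - Bs[i] for i in range(n)]
    denom = sum(s[i] * r[i] for i in range(n))
    return [[B[i][j] + r[i] * r[j] / denom for j in range(n)] for i in range(n)]


# Hypothetical quadratic f(x) = 0.5 x^T A x, so y = A s for every step s.
A = [[2.0, 1.0], [1.0, 3.0]]
B = [[1.0, 0.0], [0.0, 1.0]]          # B_0 = I
for s in ([1.0, 0.0], [0.0, 1.0]):    # two linearly independent steps
    y = [sum(A[i][j] * s[j] for j in range(2)) for i in range(2)]
    B = sr1_update(B, s, y)

print(B)  # the exact Hessian A for this example
```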

For nonquadratic functions, however, convergence of the SR1 is not as well understood
as convergence of the BFGS method. In fact, Broyden, Dennis, and Moré [1973] have
shown that under their assumptions the SR1 update can be undefined, and thus their
convergence analysis cannot be applied in this case. Also, no global convergence result
similar to that for the BFGS method given by Powell [1976] exists, so far, for the SR1
method when applied to a non-quadratic function.

Recent work by Conn, Gould, and Toint [1988a, 1988b, 1991] has sparked renewed
interest in the SR1 update. Conn, Gould, and Toint [1991] proved that the sequence of
matrices generated by the SR1 formula converges to the actual Hessian at the solution
$\nabla^2 f(x_*)$, provided that the steps taken are uniformly linearly independent, that the SR1
update denominator is always sufficiently different from zero, and that the iterates
converge to a finite limit. (Using this result it is simple to prove that the rate of convergence
is q-superlinear.) On the other hand, for the BFGS method Ge and Powell [1983] proved,
under a different set of assumptions, that the sequence of generated matrices converges
but not necessarily to $\nabla^2 f(x_*)$.

The numerical experiments of Conn, Gould, and Toint [1988b] indicate that minimization
algorithms based on the SR1 update may be competitive computationally with methods
using the BFGS formula. The algorithm used by Conn, Gould, and Toint [1988b] is
designed to solve problems with simple bound constraints, i.e., $l_i \le x_i \le u_i$, $i = 1, 2, \ldots, n$.
The bound constraints are incorporated into a box constrained trust region strategy for
calculating global steps, in which an inexact Newton's method oriented towards large
scale problems is used. This method uses a conjugate gradient method to approximately
solve the trust region problem at each iteration, and also incorporates a new procedure
that allows the set of active bound constraints to change substantially at each iteration.
In this context, Conn, Gould, and Toint [1988b] conclude that the SR1 performance is, in
general, somewhat better than the BFGS in terms of iterations and function evaluations
on their test problems. They point out that the use of a trust region removes a main
disadvantage of SR1 methods by allowing a meaningful step to be taken even when the
approximation is indefinite. They also point out that the skipping technique used when
the SR1 denominator is nearly zero was almost never used in their tests. They attribute
part of the success of the SR1 to the possible convergence of the updates to the true
second derivatives as discussed above. In Conn, Gould, and Toint [1991], they tested
this convergence using random search directions. These tests showed that, in comparison
with other updates such as the BFGS, the DFP, or the PSB, the SR1 generates more
accurate Hessian approximations.

The purpose of this paper is to better understand the computational and theoretical
properties of the SR1 update in the context of basic line search and trust region methods
for unconstrained optimization. In the next section, we present computational results we
obtained for the SR1 and the BFGS methods using standard line search and trust region
algorithms for small to medium sized unconstrained optimization problems. We also
report on tests of the convergence of the sequence of Hessian approximation matrices,
generated by the SR1 and BFGS formulas, and of the condition of uniform linear
independence of the sequence of steps which is required by the theory of Conn, Gould,
and Toint [1991]. These results indicate that this assumption may not be satisfied for
many problems. Therefore in Section 3, we prove a new convergence result without the
assumption of uniform linear independence of steps. Instead, it requires the assumption
of boundedness and positive definiteness of the Hessian approximation. In Section 4,
we present computational results regarding the positive definiteness of the SR1 update,
and an interesting example. Finally, in Section 5 we make some brief conclusions and
comments regarding future research.

2.  Computational  Results  and  Algorithms 

In this section, we present and discuss some numerical experiments that were conducted
in order to test the performance of secant methods for unconstrained optimization using
the SR1 formula against those using the BFGS update.

The  algorithms  we  used  are  from  the  UNCMIN  unconstrained  optimization  software 
package  (Schnabel,  Koontz,  and  Weiss  [1985])  which  provides  the  options  of  both  line 
search  and  trust  region  strategies  for  calculating  global  steps.  The  line  search  is  based  on 
backtracking,  using  quadratic  or  cubic  modeling  of  f(x)  in  the  direction  of  search,  and  the 
trust  region  step  is  determined  using  the  “hook  step”  method  to  approximately  minimize 
the  quadratic  model  within  the  trust  region.  The  frameworks  of  these  algorithms  are 
given  below. 

Algorithm 2.1 Quasi-Newton method (Line Search)

Step 0 Given an initial point $x_0$, an initial positive definite matrix $B_0$, and $\alpha = 10^{-4}$, set
$k$ (iteration number) $= 0$.

Step 1 If a convergence criterion is achieved, then stop.

Step 2 Compute a quasi-Newton direction

$$p_k = -(B_k + \mu_k I)^{-1} \nabla f(x_k)$$

where $\mu_k$ is a nonnegative scalar such that $\mu_k = 0$ if $B_k$ is safely positive definite,
else $\mu_k > 0$ is such that $B_k + \mu_k I$ is safely positive definite.

Step 3 {Using a backtracking line search, find an acceptable steplength.}

(3.1) Set $\lambda_k = 1$.

(3.2) If $f(x_{k+1}(\lambda_k)) < f(x_k) + \alpha \lambda_k \nabla f(x_k)^T p_k$, then go to Step 4.

(3.3) If first backtrack, then select the new $\lambda_k$ such that $x_{k+1}(\lambda_k)$ is the local
minimizer of the one-dimensional quadratic interpolating $f(x_k)$, $\nabla f(x_k)^T p_k$, and
$f(x_k + p_k)$ but constrain the new $\lambda_k$ to be $\ge 0.1$, else select the new $\lambda_k$ such
that $x_{k+1}(\lambda_k)$ is the local minimizer of the one-dimensional cubic interpolating
$f(x_k)$, $\nabla f(x_k)^T p_k$, $f(x_{k+1}(\lambda_{prev}))$, and $f(x_{k+1}(\lambda_{2prev}))$ but constrain the new
$\lambda_k$ to be in $[0.1 \lambda_{prev}, 0.5 \lambda_{prev}]$.

($x_{k+1}(\lambda) = x_k + \lambda p_k$ and $\lambda_{prev}$, $\lambda_{2prev}$ = previous two steplengths.)

(3.4) Go to (3.2).

Step 4 Set $x_{k+1} = x_k + \lambda_k p_k$.

Step 5 Compute the next Hessian approximation $B_{k+1}$.

Step 6 Set $k = k + 1$, and go to Step 1.
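
Step 3 of Algorithm 2.1 can be sketched as follows. This is a simplified illustration under stated assumptions, not the UNCMIN routine: the function name `backtracking_line_search` and its signature are hypothetical, and every backtrack uses the safeguarded quadratic interpolation, whereas the algorithm above switches to cubic interpolation after the first backtrack.

```python
from typing import Callable, List, Tuple

Vector = List[float]


def backtracking_line_search(
    f: Callable[[Vector], float],
    x: Vector,
    fx: float,
    grad: Vector,
    p: Vector,
    alpha: float = 1e-4,
    max_backtracks: int = 50,
) -> Tuple[float, Vector, float]:
    """Return (steplength, new point, new f value) passing the test in (3.2)."""
    slope = sum(g * d for g, d in zip(grad, p))
    if slope >= 0.0:
        raise ValueError("p must be a descent direction")
    lam = 1.0
    for _ in range(max_backtracks):
        x_new = [xi + lam * pi for xi, pi in zip(x, p)]
        f_new = f(x_new)
        if f_new < fx + alpha * lam * slope:
            return lam, x_new, f_new
        # Minimizer of the quadratic interpolating f(x), the slope, and f(x + lam p).
        lam_quad = -slope * lam * lam / (2.0 * (f_new - fx - lam * slope))
        # Safeguard: keep the new steplength in [0.1 lam, 0.5 lam].
        lam = min(max(lam_quad, 0.1 * lam), 0.5 * lam)
    raise RuntimeError("no acceptable steplength found")
```

The denominator in `lam_quad` is positive whenever the acceptance test fails, because $\alpha < 1$ and the slope is negative.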

Algorithm 2.2 Quasi-Newton method (Trust Region)

Step 0 Given an initial point $x_0$, an initial positive definite matrix $B_0$, an initial trust
region radius $\Delta_0$, $\eta_1 \in (0, 1)$ and $\eta_2 > 1$, set $k = 0$.

Step 1 If a convergence criterion is achieved, then stop.

Step 2 If $B_k$ is not positive definite set $\hat{B}_k = B_k + \mu_k I$ where $\mu_k$ is such that $\hat{B}_k = B_k + \mu_k I$
is safely positive definite, else set $\hat{B}_k = B_k$.

Step 3 {Compute trust region step by hook step approximation.}

Find an approximate solution to

$$\min_s \ \nabla f(x_k)^T s + \tfrac{1}{2} s^T \hat{B}_k s \quad \text{subject to} \quad \|s\| \le \Delta_k$$

by selecting

$$s_k = -(\hat{B}_k + \nu_k I)^{-1} \nabla f(x_k), \quad \nu_k \ge 0,$$

such that $\|s_k\| \in [0.75 \Delta_k, 1.5 \Delta_k]$, or

$$s_k = -\hat{B}_k^{-1} \nabla f(x_k),$$

if $\|\hat{B}_k^{-1} \nabla f(x_k)\| \le 1.5 \Delta_k$.

Step 4 Set $ared_k = f(x_k + s_k) - f(x_k)$.

Step 5 If $ared_k < 10^{-4} \nabla f(x_k)^T s_k$, then

(5.1) set $x_{k+1} = x_k + s_k$;

(5.2) calculate $pred_k = \nabla f(x_k)^T s_k + \frac{1}{2} s_k^T \hat{B}_k s_k$;

(5.3) if $ared_k / pred_k < 0.1$, then set $\Delta_{k+1} = \Delta_k / 2$, else if $ared_k / pred_k > 0.75$, then set
$\Delta_{k+1} = 2 \Delta_k$, otherwise $\Delta_{k+1} = \Delta_k$;

(5.4) go to Step 7;

Step 6 else

(6.1) if relative steplength is too small, then stop; else calculate the $\lambda_k$ for which
$x_k + \lambda_k s_k$ is the minimizer of the one-dimensional quadratic interpolating
$f(x_k)$, $f(x_k + s_k)$, and $\nabla f(x_k)^T s_k$; set new $\Delta_k = \lambda_k \|s_k\|$, but constrain new
$\Delta_k$ to be between 0.1 and 0.5 times current $\Delta_k$.

(6.2) go to Step 3.

Step 7 Compute the next Hessian approximation $B_{k+1}$.

Step 8 Set $k = k + 1$, and go to Step 1.
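
The radius update in Step (5.3) of Algorithm 2.2 is simple enough to sketch directly. The helper names `predicted_reduction` and `update_trust_radius` are hypothetical, and the sketch assumes an accepted step, so that both the actual and the predicted reduction are negative.

```python
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def predicted_reduction(grad: Vector, B: Matrix, s: Vector) -> float:
    """pred = grad^T s + 0.5 s^T B s, the model change used in Step (5.2)."""
    n = len(s)
    Bs = [sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    gs = sum(g * si for g, si in zip(grad, s))
    return gs + 0.5 * sum(si * bsi for si, bsi in zip(s, Bs))


def update_trust_radius(delta: float, ared: float, pred: float) -> float:
    """Step (5.3): halve, double, or keep the radius based on ared / pred."""
    ratio = ared / pred  # assumes pred < 0, i.e. the model predicts a decrease
    if ratio < 0.1:
        return delta / 2.0
    if ratio > 0.75:
        return 2.0 * delta
    return delta
```

A ratio close to 1 means the quadratic model predicted the actual change in $f$ well, which is why the radius is enlarged in that case.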
Procedures for updating $\lambda_k$ in Step 3 of the line search algorithm are found in
Algorithm A6.3.1 of Dennis and Schnabel [1983]. While a steplength $\lambda_k > 1$ is not considered
in the reported results, in our experience permitting $\lambda_k > 1$ makes very little difference
on these test problems. Procedures for finding $\nu_k$ in Step 3 of the trust region algorithm
are found in Algorithm A6.4.2 of Dennis and Schnabel [1983], and are based on
Hebden [1973] and Moré [1977]. In both algorithms, the procedure for selecting $\mu_k$ in Step
2 is found in Gill, Murray, and Wright [1981]. (They give an algorithm for finding a
diagonal matrix $D$, such that $B_k + D$ is safely positive definite. If $D = 0$, then $\mu_k$ is set
to 0, else an upper bound $b_1$ on $\mu_k$ is calculated using the Gerschgorin circle theorem,
and $\mu_k$ is set to $\min\{b_1, b_2\}$ where $b_2 = \max\{[D]_{ii}, 1 \le i \le n\}$.) In our experience, when
$B_k$ is indefinite, $\mu_k$ is quite close to the most negative eigenvalue of $B_k$, so that the
algorithm usually finds an approximate minimizer of the quadratic model subject to the
trust region constraint.

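Both algorithms terminate if one of the following stopping criteria is met.

(1) The number of iterations exceeds a given upper limit.

(2) The relative gradient,

$$\max_{1 \le i \le n} \left\{ \left| [\nabla f(x_{k+1})]_i \right| \frac{\max\{ |[x_{k+1}]_i|, 1 \}}{\max\{ |f(x_{k+1})|, 1 \}} \right\},$$

is less than a given gradient tolerance.

(3) The relative step,

$$\max_{1 \le i \le n} \left\{ \frac{ |[s_k]_i| }{ \max\{ |[x_{k+1}]_i|, 1 \} } \right\},$$

is less than a given step tolerance.

All the algorithms used $B_0 = I$.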
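
The two scaled stopping measures can be sketched as below. The function names `relative_gradient` and `relative_step` are assumptions made for this illustration; the formulas are the componentwise relative measures of criteria (2) and (3), in the scaled form used by Dennis and Schnabel [1983].

```python
from typing import List

Vector = List[float]


def relative_gradient(grad: Vector, x: Vector, fx: float) -> float:
    """max_i |grad_i| * max(|x_i|, 1) / max(|f(x)|, 1), as in criterion (2)."""
    scale = max(abs(fx), 1.0)
    return max(abs(g) * max(abs(xi), 1.0) for g, xi in zip(grad, x)) / scale


def relative_step(s: Vector, x_new: Vector) -> float:
    """max_i |s_i| / max(|x_new_i|, 1), as in criterion (3)."""
    return max(abs(si) / max(abs(xi), 1.0) for si, xi in zip(s, x_new))
```

An iteration would stop when either value falls below its tolerance, for example `relative_gradient(g, x, fx) < 1e-5`.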

2.1. Comparison of the SR1 and the BFGS Methods

Using the above outlined algorithms, we tested the SR1 method and the BFGS method
on a variety of test problems selected from Moré, Garbow, and Hillstrom [1981] and from
Conn, Gould, and Toint [1988b] (see Table A1 in the appendix). First derivatives were
approximated using finite differences. The gradient stopping tolerance used was $10^{-5}$,
and the step tolerance was $(\text{machine epsilon})^{1/2}$. The upper bound used on the number
of iterations was 500. As done in Conn, Gould, and Toint [1988b], we skipped the SR1
update if either

$$|s_k^T (y_k - B_k s_k)| < r \|s_k\| \|y_k - B_k s_k\|,$$

where $r = 10^{-8}$, or if $\|B_{k+1} - B_k\| > 10^8$. The BFGS update was skipped if
$s_k^T y_k < (\text{machine epsilon})^{1/2} \|s_k\| \|y_k\|$. All experiments were run using double precision
arithmetic on a Pyramid P90 computer that has a machine epsilon of order $10^{-16}$.
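
The skipping rules just described can be sketched as two predicates. The names `skip_sr1` and `skip_bfgs` are hypothetical, and two details are assumptions of this sketch rather than statements from the text: the size of the SR1 change is measured as $\|r\|^2 / |s^T r|$ with $r = y - Bs$ (the 2-norm of the rank-one correction), and a zero residual is treated as "nothing to update".

```python
import math
import sys
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def norm(v: Vector) -> float:
    return math.sqrt(sum(x * x for x in v))


def skip_sr1(B: Matrix, s: Vector, y: Vector,
             r: float = 1e-8, max_change: float = 1e8) -> bool:
    """True if the SR1 update should be skipped for this (s, y) pair."""
    n = len(s)
    resid = [y[i] - sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    resid_norm = norm(resid)
    if resid_norm == 0.0:
        return True  # assumption: B already satisfies the secant equation
    denom = sum(si * ri for si, ri in zip(s, resid))
    if abs(denom) < r * norm(s) * resid_norm:
        return True  # denominator too small relative to ||s|| ||y - B s||
    # Norm of the rank-one correction r r^T / denom.
    return resid_norm * resid_norm / abs(denom) > max_change


def skip_bfgs(s: Vector, y: Vector,
              eps: float = sys.float_info.epsilon) -> bool:
    """True if s^T y is too small for a safe BFGS update."""
    sy = sum(si * yi for si, yi in zip(s, y))
    return sy < math.sqrt(eps) * norm(s) * norm(y)
```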

For each test function, Tables A2 and A3 in the appendix report the performance
of the SR1 and BFGS methods using line search and trust region respectively. The
tables contain the number of the function as given in the original source (see Table A1),
the dimension of the problem (n), the number of iterations required to solve the problem
(itrn.), the number of function evaluations (f-eval.) required to solve the problem (which
includes n for each finite difference gradient evaluation), and the relative gradient at the
solution (rgx). The last column (sp) indicates whether the starting point used is $x_0$,
$10 x_0$, or $100 x_0$, where $x_0$ is the standard starting point.

In order to compare the performance of the two methods with respect to the number
of iterations and the number of function evaluations required to solve these problems,
we consider problems solved by both methods and calculate the ratio of the mean of the
number of iterations (or function evaluations) required to solve these problems by the
SR1 method to the corresponding mean for the BFGS method. Table 1 below reports the
ratios of these means, using both arithmetic mean and geometric mean. These numbers
indicate that on the set of test problems we used, the SR1 is 10% to 15% faster and
cheaper than the BFGS method.
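
The two ratios reported in Table 1 can be computed as in the following sketch. The function name `mean_ratios` and the iteration counts in the usage example are hypothetical and are not data from the paper.

```python
import math
from typing import List, Tuple


def mean_ratios(sr1_counts: List[float],
                bfgs_counts: List[float]) -> Tuple[float, float]:
    """Return (arithmetic-mean ratio, geometric-mean ratio) of SR1 to BFGS cost.

    Both lists must cover the same problems (those solved by both methods)
    and contain strictly positive counts.
    """
    n = len(sr1_counts)
    arithmetic = (sum(sr1_counts) / n) / (sum(bfgs_counts) / n)
    log_sr1 = sum(math.log(c) for c in sr1_counts) / n
    log_bfgs = sum(math.log(c) for c in bfgs_counts) / n
    geometric = math.exp(log_sr1 - log_bfgs)
    return arithmetic, geometric


# Hypothetical iteration counts on three problems.
arith, geo = mean_ratios([10, 20, 40], [20, 20, 40])
```

The geometric mean weights each problem equally in relative terms, so a few expensive problems do not dominate the comparison as they can with the arithmetic mean.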

Table 2 gives the number of problems where the SR1 method requires at least 5, 10,
20, 30, 40, 50 iterations less than the BFGS method, and vice versa. This table, which
is based on numbers from Table A2, also indicates the superiority of the SR1 on these
problems.


Table 1: Ratio of SR1 Cost to BFGS Cost

              Line Search                      Trust Region
Mean          Itrn.   Function Evaluations     Itrn.   Function Evaluations
Arithmetic    0.82    0.83                     0.84    0.88
Geometric     0.83    0.85                     0.84    0.92

Table 2: Comparisons of Iterations

                        Line Search                     Trust Region
Iterations Different     5   10   20   30   40   50      5   10   20   30   40   50
SR1 Better              26   20   13   10    7    3      9    5    1
BFGS Better              7    5    2    2    1    1      8    6    3    1    1    1

2.2. Error in the Hessian Approximation and Uniform Linear Independence

In an attempt to understand the difference between the SR1 and the BFGS, we tested
how closely the final Hessian approximations produced by the line search and trust region
SR1 and BFGS algorithms come to the exact Hessians at the final iterates. Recall that
the Hessian error for the SR1 is analyzed by Conn, Gould, and Toint [1991] under the
assumption of uniform linear independence which we redefine here.

Definition A sequence of vectors $\{s_k\}$ in $\mathbb{R}^n$ is said to be uniformly linearly independent
if there exist $\zeta > 0$, $k_0$ and $m \ge n$ such that, for each $k \ge k_0$, one can choose $n$ distinct
indices $k \le k_1 < \ldots < k_n \le k + m$ such that the minimum singular value of the matrix
$S_k = [s_{k_1}/\|s_{k_1}\|, \ldots, s_{k_n}/\|s_{k_n}\|]$ is $\ge \zeta$.

Using this definition, Theorem 2 of Conn, Gould, and Toint [1991] proves the following.

Theorem 2.1 (Conn, Gould, and Toint [1991]) Suppose that $f(x)$ is twice continuously
differentiable everywhere, and that $\nabla^2 f(x)$ is bounded and Lipschitz continuous, that is,
there exist constants $M > 0$ and $\gamma > 0$ such that for all $x, y \in \mathbb{R}^n$,

$$\|\nabla^2 f(x)\| \le M \quad \text{and} \quad \|\nabla^2 f(x) - \nabla^2 f(y)\| \le \gamma \|x - y\|.$$

Let $x_{k+1} = x_k + s_k$, where $\{s_k\}$ is a uniformly linearly independent sequence of steps,
and suppose that $\lim_{k \to \infty} x_k = x_*$ for some $x_* \in \mathbb{R}^n$. Let $\{B_k\}$ be generated by the SR1
formula

$$B_{k+1} = B_k + \frac{(y_k - B_k s_k)(y_k - B_k s_k)^T}{s_k^T (y_k - B_k s_k)},$$

where $B_0$ is symmetric, and suppose that $\forall k \ge 0$, $y_k$ and $s_k$ satisfy

$$|s_k^T (y_k - B_k s_k)| \ge r \|s_k\| \|y_k - B_k s_k\|, \qquad (2.1)$$

for some fixed $r \in (0, 1)$. Then $\lim_{k \to \infty} \|B_k - \nabla^2 f(x_*)\| = 0$.

We now present some computational tests to determine to what extent such Hessian
convergence occurs in practice. For these tests we used analytic gradients and a gradient
stopping tolerance of $10^{-10}$ and computed the quantity

$$\|B_l - \nabla^2 f(x_l)\| / \|\nabla^2 f(x_l)\|,$$

where $x_l$ is the solution obtained by the algorithm, and $B_l$ is the Hessian approximation
at $x_l$. These results are reported in Tables A4 and A5 in the appendix and summarized
in Tables 3 and 4 below. Tables 3 and 4 list for each method, the number of problems
for which $\|B_l - \nabla^2 f(x_l)\| / \|\nabla^2 f(x_l)\|$ lies in a given range.
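
The relative error measure can be sketched as follows. The text does not say which matrix norm was used, so the Frobenius norm in this sketch is an assumption, as is the function name `relative_hessian_error`.

```python
import math
from typing import List

Matrix = List[List[float]]


def frobenius_norm(A: Matrix) -> float:
    return math.sqrt(sum(a * a for row in A for a in row))


def relative_hessian_error(B: Matrix, H: Matrix) -> float:
    """||B - H|| / ||H|| in the Frobenius norm (assumed choice of norm)."""
    diff = [[b - h for b, h in zip(rb, rh)] for rb, rh in zip(B, H)]
    return frobenius_norm(diff) / frobenius_norm(H)
```

For example, with `H = [[3, 0], [0, 4]]` and `B = [[3, 0], [0, 4.5]]` the error is 0.5 / 5 = 0.1, which would fall in the range $[10^{-1}, 1)$.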


Table 3: Number of Problems with $\|B_l - \nabla^2 f(x_l)\| / \|\nabla^2 f(x_l)\|$ in Indicated Range (Line
Search Methods)

        $< 10^{-4}$   $[10^{-4}, 10^{-3})$   $[10^{-3}, 10^{-2})$   $[10^{-2}, 10^{-1})$   $[10^{-1}, 1)$   $\ge 1$
SR1     4             3                      2                      8                      3                8
BFGS    2             0                      1                      10                     6                8


Table 4: Number of Problems with $\|B_l - \nabla^2 f(x_l)\| / \|\nabla^2 f(x_l)\|$ in Indicated Range (Trust
Region Methods)

        $< 10^{-4}$   $[10^{-4}, 10^{-3})$   $[10^{-3}, 10^{-2})$   $[10^{-2}, 10^{-1})$   $[10^{-1}, 1)$   $\ge 1$
SR1     0             0                      4                      5                      4                10
BFGS    0             0                      5                      7                      7                9


While the SR1 seems to produce slightly better final approximations than the BFGS,
there is no evidence from these tables that it significantly outperforms the BFGS with
respect to convergence to the actual Hessian at the solution. Also, in a good number of
cases, neither method comes close to the correct Hessian.

The lack of convergence of the SR1 Hessian approximations to the correct value in
many of these problems may appear to conflict with the analysis of Conn, Gould, and
Toint [1991] given in Theorem 2.1. In fact, there are two possible explanations for
this apparent conflict: either the algorithm has not converged closely enough for the
final convergence of the matrices to be apparent (this is hard to test in finite precision
arithmetic) or an assumption of Theorem 2.1 must be violated. The two assumptions of
Theorem 2.1 that could possibly be invalid are that the denominator of the SR1 is not
too small (2.1), and the uniform linear independence condition. In our experiments, (2.1)
was violated at most once for each test problem, and so this assumption does not appear
to be a problem in the SR1 method. Thus, we decided to test whether the uniform linear
independence condition is satisfied for those problems.

Since the uniform linear independence condition would be very hard to test due to
the freedom to choose $m$ and $\zeta$, we have tested a weaker condition. For each value
$\tau = 10^{-i}$, $i = 1, 2, 3$, we computed the number of steps (say $m$) required so that the
smallest singular value of the matrix $S_m$, composed of the final normalized $m$ steps of the
algorithm, is $\ge \tau$ ($S_m = [s_l/\|s_l\|, \ldots, s_{l-(m-1)}/\|s_{l-(m-1)}\|]$, where $m \ge n$).

Tables A6 and A7 contain the results of these experiments, which are summarized in
Tables 5 and 6 below. A  entry in Tables A6 and A7 means that the smallest singular
value is $< \tau$ even if all the iterates are used.
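
The singular value test described above can be sketched without external libraries. This is an illustrative implementation under stated assumptions, not the code used for the experiments: the function name `smallest_singular_value` is hypothetical, the steps are assumed nonzero with $m \ge n$, and the smallest singular value is obtained from the eigenvalues of the $n \times n$ Gram matrix $S S^T$ via cyclic Jacobi rotations, which is adequate for small $n$ but less accurate than a true SVD for very small singular values.

```python
import math
from typing import List

Vector = List[float]


def smallest_singular_value(steps: List[Vector], sweeps: int = 30) -> float:
    """sigma_min of the n x m matrix whose columns are the normalized steps."""
    n = len(steps[0])
    cols = []
    for s in steps:
        nrm = math.sqrt(sum(x * x for x in s))
        cols.append([x / nrm for x in s])
    # Gram matrix G = S S^T; its eigenvalues are the squared singular values.
    G = [[sum(c[i] * c[j] for c in cols) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(G[p][q] ** 2 for p in range(n) for q in range(p + 1, n))
        if off < 1e-30:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if G[p][q] == 0.0:
                    continue
                # Rotation angle that zeroes the (p, q) entry.
                theta = 0.5 * math.atan2(2.0 * G[p][q], G[q][q] - G[p][p])
                c, sn = math.cos(theta), math.sin(theta)
                for k in range(n):  # G <- G J (update columns p and q)
                    gkp, gkq = G[k][p], G[k][q]
                    G[k][p] = c * gkp - sn * gkq
                    G[k][q] = sn * gkp + c * gkq
                for k in range(n):  # G <- J^T G (update rows p and q)
                    gpk, gqk = G[p][k], G[q][k]
                    G[p][k] = c * gpk - sn * gqk
                    G[q][k] = sn * gpk + c * gqk
    return math.sqrt(max(min(G[i][i] for i in range(n)), 0.0))
```

To reproduce the test in the text, one would call this on the last $m$ steps for increasing $m \ge n$ and record the first $m$ for which the returned value is at least $\tau$.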

These results indicate that the uniform linear independence assumption does not seem
to hold for all problems, especially those with large dimensions. Therefore in the next
section we develop a convergence result for the SR1 method that does not make this
assumption.


Table 5: Number of Problems where $\sigma_{\min}(S_m) \ge \tau$ for $m/n$ in Indicated Range. - Line
Search SR1 Method


T 

m/n  in 

Never 

[1.2) 

[2.3) 

[3-4) 

[•1-5) 

[5-10) 

10“  I 

i 

1 

3 

3 

6 

8 

10“2 

1 

0 

3 

5 

< 

10“  3 

12 

1 

0 

4 

4 

7 

Table 6: Number of Problems where $\sigma_{\min}(S_m) \ge \tau$ for $m/n$ in Indicated Range. - Trust
Region SR1 Method


T 

m/n  in 

Never 

[1.2) 

[2.3) 

[3-4)  1  [4-5) 

[5-10) 

10“ 

4 

3 

0  [  3 

G 

12 

TM 

1 

O 

12 

1 

0  |  3 

3 

7 

10“  8 

13 

0 

_ 0J _ L 

5 

t 

3. Convergence Rate of the SR1 Without Uniform Linear Independence

As was indicated at the end of the previous section, the condition of uniform linear
independence of the sequence $\{s_k\}$ under which Conn, Gould, and Toint [1991] analyze
the performance of the SR1 method may be too strong in practice. Therefore in this
section we consider the convergence rate of the SR1 method without this condition. We
will show that if we drop the condition of uniform linear independence of $\{s_k\}$ but add
instead the assumption that the sequence $\{B_k\}$ remains positive definite and bounded,
then the line search Algorithm 2.1 generates at least $p$ q-superlinear steps out of every
$n + p$ steps. This will enable us to prove that convergence is 2n-step q-quadratic.

The basic idea behind our proof is that, if any step falls close enough to a subspace
spanned by $m \le n$ recent steps, then the Hessian approximation must be quite accurate
in this subspace. Thus, if in addition the step is the full secant step $-B_k^{-1} \nabla f(x_k)$, it
should be a superlinear step. But in a line search method, for the step to be the full
secant step, $B_k$ must be positive definite, which accounts for the new assumption of
positive definiteness of $B_k$ at the good steps. In Section 4 we will show that empirically
this assumption seems very sound, although counterexamples are possible.

Throughout this section the following assumptions will frequently be made:

Assumptions

3.1 The function $f$ has a local minimizer at a point $x_*$ such that $\nabla^2 f(x_*)$ is positive
definite, and its Hessian $\nabla^2 f(x)$ is Lipschitz continuous near $x_*$, that is, there exists
a constant $\gamma > 0$ such that for all $x$, $y$ in some neighborhood of $x_*$,

$$\|\nabla^2 f(x) - \nabla^2 f(y)\| \le \gamma \|x - y\|.$$

3.2 The sequence $\{x_k\}$ converges to the local minimizer $x_*$.

We  first  state  the  following  result,  due  to  Conn,  Gould,  and  Toint  [1991],  which  does  not 
assume  linear  independence  of  the  step  directions  and  which  will  be  used  in  the  proof  of 
the  next  lemma. 

Lemma 3.1 Let $\{x_k\}$ be a sequence of iterates defined by $x_{k+1} = x_k + s_k$. Suppose that
Assumptions 3.1 and 3.2 hold, that the sequence of matrices $\{B_k\}$ is generated from $\{x_k\}$
by the SR1 update, and that for each iteration

$$|s_k^T (y_k - B_k s_k)| \ge r \|s_k\| \|y_k - B_k s_k\| \qquad (3.1)$$

where $r$ is a constant $\in (0, 1)$. Then, for each $j$, $\|y_j - B_{j+1} s_j\| = 0$, and

$$\|y_j - B_i s_j\| \le \frac{\gamma}{r} \left( \frac{2}{r} + 1 \right)^{i-j-2} \eta_{i,j} \|s_j\| \qquad (3.2)$$

for all $i \ge j + 2$, where $\eta_{i,j} = \max \{ \|x_p - x_s\| \mid j \le s \le p \le i \}$, and $\gamma$ is the Lipschitz
constant from Assumption 3.1.


Actually, it is apparent from the proof of Lemma 3.1 by Conn, Gould, and Toint
that, if the update is skipped whenever (3.1) is violated, then (3.2) still holds for all $j$
for which (3.1) is true.

In the lemma below, we show that if the sequence of steps generated by an iterative
process using the SR1 update satisfies (3.1), and the sequence of matrices is bounded,
then out of any set of $n + 1$ steps, at least one is very good. As in the previous lemma,
condition (3.1) actually need only hold at this set of $n + 1$ steps, as long as the update
is not made when that condition fails.


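Lemma 3.2 Suppose the assumptions of Lemma 3.1 are satisfied for the sequences $\{x_k\}$
and $\{B_k\}$ and that in addition there exists $M$ for which $\|B_k\| \le M$ for all $k$. Then there
exists $K \ge 0$ such that for any set of $n + 1$ steps $S = \{ s_{k_j} : K \le k_1 < \cdots < k_{n+1} \}$ there
exists an index $k_m$ with $m \in \{2, 3, \ldots, n + 1\}$ such that

$$\frac{\| (B_{k_m} - \nabla^2 f(x_*)) s_{k_m} \|}{\| s_{k_m} \|} < \bar{c} \, \epsilon_S^{1/n},$$

where

$$\epsilon_S = \max_{1 \le j \le n+1} \{ \| x_{k_j} - x_* \| \}$$

and

$$\bar{c} = 4 \left[ \sqrt{n} \left( \frac{\gamma}{r} \left( \frac{2}{r} + 1 \right)^{k_{n+1} - k_1 - 2} + \gamma \right) + M + \| \nabla^2 f(x_*) \| \right].$$
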


Proof.  Given  for  j  =-  1  +  1  define 


Sj 


Skt  Sk2  Sk} 

KiriKr"’Kii 


We  will  first  show  that  3m  €  [2,  n  -f  1]  such  that  =  5m_iu  -  w,  Sm- i  has 

full column rank and is well conditioned, and $\|w\|$ is very small. (In essence either $m = n+1$, $S_n$ spans $n$-space well, and $w = 0$, or $m < n+1$, $S_{m-1}$ has full rank and is well conditioned, and $s_{k_m}$ is nearly in the space spanned by $S_{m-1}$.) Then, using the fact that $(B_{k_m} - \nabla^2 f(x_*))S_{m-1}$ is small due to the Hessian approximating properties of the SR1 update given in Lemma 3.1 above, and that $w$ is small by this construction, we will have the desired result.

For $j \in \{1, \ldots, n\}$, let $\tau_j$ be the smallest singular value of $S_j$ and define $\tau_{n+1} = 0$. Note that

$$1 = \tau_1 \ge \tau_2 \ge \cdots \ge \tau_{n+1} = 0.$$

Let $m$ be the smallest integer for which

$$\frac{\tau_m}{\tau_{m-1}} < \epsilon_S^{1/n}. \tag{3.3}$$

Then since $m \le n+1$ and $\tau_1 = 1$,

$$\tau_{m-1} = \tau_1 \left(\frac{\tau_2}{\tau_1}\right) \cdots \left(\frac{\tau_{m-1}}{\tau_{m-2}}\right) \ge \epsilon_S^{(m-2)/n} \ge \epsilon_S^{(n-1)/n}. \tag{3.4}$$

Since $x_k \to x_*$, we may assume without loss of generality that $\epsilon_S \in (0, (\tfrac{1}{4})^n)$ for all $k$. Now we choose $z \in R^m$ such that

$$\|S_m z\| = \tau_m \|z\| \tag{3.5}$$

and

$$z = \begin{pmatrix} u \\ -1 \end{pmatrix},$$

where $u \in R^{m-1}$. (The last component of $z$ is nonzero due to (3.3).) Let $w = S_m z$. Then, from the definition of $S_m$ and $z$,

$$\frac{s_{k_m}}{\|s_{k_m}\|} = S_{m-1} u - w. \tag{3.6}$$

Since $\tau_{m-1}$ is the smallest singular value of $S_{m-1}$ we have that

$$\|u\| \le \frac{1}{\tau_{m-1}} \|S_{m-1} u\| = \frac{1}{\tau_{m-1}} \left\| w + \frac{s_{k_m}}{\|s_{k_m}\|} \right\| \le \frac{\|w\| + 1}{\tau_{m-1}}. \tag{3.7}$$

By (3.4) this implies that

$$\|u\| \le \frac{\|w\| + 1}{\epsilon_S^{(n-1)/n}}. \tag{3.8}$$

Also, using (3.5) and (3.7), we have that

$$\|w\|^2 = \|S_m z\|^2 = \tau_m^2 \|z\|^2 = \tau_m^2 (1 + \|u\|^2) \le \tau_m^2 + \left(\frac{\tau_m}{\tau_{m-1}}\right)^2 (\|w\| + 1)^2.$$

Therefore, since (3.3) implies that $\tau_m < \epsilon_S^{1/n}$, using (3.3),

$$\|w\|^2 < \epsilon_S^{2/n} + \epsilon_S^{2/n} (\|w\| + 1)^2 < 4 \epsilon_S^{2/n} (\|w\| + 1)^2. \tag{3.9}$$

This implies that

$$\|w\| (1 - 2\epsilon_S^{1/n}) < 2 \epsilon_S^{1/n}$$

and hence $\|w\| < 1$, since $\epsilon_S < (\tfrac{1}{4})^n$. Therefore, (3.8) and (3.9) imply that

$$\|u\| < \frac{2}{\epsilon_S^{(n-1)/n}}, \tag{3.10}$$

$$\|w\| < 4 \epsilon_S^{1/n}. \tag{3.11}$$

This gives the desired result that $w$ is small, as well as a necessary bound on $\|u\|$.

Now we show that $\|(B_{k_j} - \nabla^2 f(x_*)) S_{j-1}\|$, $j \in [2, n+1]$, is small. Note that this result is independent of the choice of $j$. By Lemma 3.1 we have that


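$$\|y_i - B_{k_j} s_i\| \le 2c\,\epsilon_S \|s_i\| \tag{3.12}$$

for all $i \in \{k_1, k_2, \ldots, k_{j-1}\}$. Also, letting

$$G_i = \int_0^1 \nabla^2 f(x_i + t s_i)\,dt,$$

we have

$$G_i s_i = \int_0^1 \nabla^2 f(x_i + t s_i) s_i\,dt = \nabla f(x_{i+1}) - \nabla f(x_i) = y_i$$

and by the Lipschitz continuity of $\nabla^2 f(x)$,

$$\begin{aligned}
\|y_i - \nabla^2 f(x_*) s_i\| &= \|(G_i - \nabla^2 f(x_*)) s_i\| \\
&= \left\| \int_0^1 (\nabla^2 f(x_i + t s_i) - \nabla^2 f(x_*)) s_i\,dt \right\| \\
&\le \|s_i\| \int_0^1 \|\nabla^2 f(x_i + t s_i) - \nabla^2 f(x_*)\|\,dt \\
&\le \gamma \|s_i\| \int_0^1 \|x_i + t s_i - x_*\|\,dt \\
&\le \gamma \|s_i\| \epsilon_S,
\end{aligned} \tag{3.13}$$

where $\gamma$ is the Lipschitz constant. Therefore, using the triangle inequality, and (3.12) and (3.13) we have

$$\left\| (B_{k_j} - \nabla^2 f(x_*)) \frac{s_i}{\|s_i\|} \right\| \le \left\| \frac{y_i - B_{k_j} s_i}{\|s_i\|} \right\| + \left\| \frac{y_i - \nabla^2 f(x_*) s_i}{\|s_i\|} \right\| \le (2c + \gamma)\epsilon_S,$$

where $c = \frac{\gamma}{r} \left( \frac{2}{r} + 1 \right)^{k_{n+1} - k_1 - 2}$, and hence for any $j \in [2, n+1]$

$$\|(B_{k_j} - \nabla^2 f(x_*)) S_{j-1}\| \le \sqrt{n}\,(2c + \gamma)\epsilon_S. \tag{3.14}$$

Finally, using (3.6), (3.14) with $j = m$, (3.11) and (3.10) we have that

$$\begin{aligned}
\frac{\|(B_{k_m} - \nabla^2 f(x_*)) s_{k_m}\|}{\|s_{k_m}\|} &= \|(B_{k_m} - \nabla^2 f(x_*))(S_{m-1} u - w)\| \\
&\le \|(B_{k_m} - \nabla^2 f(x_*)) S_{m-1}\| \|u\| + \|B_{k_m} - \nabla^2 f(x_*)\| \|w\| \\
&\le \|u\| \sqrt{n}\,(2c + \gamma)\epsilon_S + \|w\| (M + \|\nabla^2 f(x_*)\|) \\
&< \frac{2}{\epsilon_S^{(n-1)/n}} \sqrt{n}\,(2c + \gamma)\epsilon_S + 4 \epsilon_S^{1/n} (M + \|\nabla^2 f(x_*)\|) \\
&\le 4 \left[ \sqrt{n}\,(c + \gamma) + M + \|\nabla^2 f(x_*)\| \right] \epsilon_S^{1/n} \\
&= \bar{c}\,\epsilon_S^{1/n},
\end{aligned}$$

with $\bar{c} = 4[\sqrt{n}\,(c + \gamma) + M + \|\nabla^2 f(x_*)\|]$. □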


In order to use this lemma to establish a rate of convergence we need the following result, which is closely related to the well-known superlinear convergence characterization of Dennis and Moré [1974].


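so that

$$s_k = -(x_k - x_*) + \nabla^2 f(x_*)^{-1} \left[ (\nabla^2 f(x_*) - B_k) s_k - \nabla f(x_k) + \nabla^2 f(x_*)(x_k - x_*) \right]. \tag{3.15}$$

Therefore, using Taylor's theorem and Assumption 3.1,

$$\|x_k - x_* + s_k\| \le \|\nabla^2 f(x_*)^{-1}\| \left[ \|(\nabla^2 f(x_*) - B_k) s_k\| + \frac{\gamma}{2} \|x_k - x_*\|^2 \right]. \tag{3.16}$$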

Now  it  follows  from  (3.15)  that  if  ||V2/(x.)~1||||(/?fc  -  V2/(x„))sfc||/||sfc||  <  ^  then  by 
Taylor’s  theorem, 

lk*ll  <  |[l k*  -  *«ll  +  l|v2/(*.r1ll|lk*  -  *.||2]  <  211  ik  -  2.11, 

if  ek  is  sufficiently  small.  Using  this  inequality  together  with  (3.16)  gives  the  result.  □ 

Using  these  two  lemmas  one  can  show  that  for  any  p  >  n,  Algorithm  2.1  will  generate 
at  least  p  -  n  superlinear  steps  every  p  iterations  provided  that  Bk  is  safely  positive 
definite,  which  implies  that  B k  is  not  perturbed  in  Step  2  and  =  0.  In  the  following 
theorem,  this  is  proved  and  used  to  establish  a  rate  of  convergence  for  Algorithm  2.1 
under  the  assumption  that  the  sequence  { Bk }  becomes,  and  stays,  positive  definite.  In 
a  corollary  we  show  that  this  implies  that  the  rate  of  convergence  for  Algorithm  2.1  is 
2n-step  q-quadratic.  As  we  will  see  in  the  next  section,  our  test  results  show  that  the 
positive  definiteness  condition  is  generally  satisfied  in  practice.  We  are  assuming  here 
that  if  Bk  is  positive  definite,  then  it  is  not  perturbed  in  Step  2,  i.e.,  we  are  assuming 
that  “safely  positive  definite”  just  means  positive  definite. 

Theorem 3.1 Consider Algorithm 2.1 and suppose that Assumptions 3.1 and 3.2 hold. Assume also that for all $k \ge 0$,

$$|s_k^T (y_k - B_k s_k)| \ge r \|s_k\| \|y_k - B_k s_k\|,$$

for a fixed $r \in (0, 1)$, and that $\exists M$ for which $\|B_k\| \le M\ \forall k$. Then, if $\exists K_0$ such that $B_k$ is positive definite for all $k \ge K_0$, then for any $p \ge n+1$ there exists $K_1$ such that for all $k \ge K_1$,

$$e_{k+p} \le \alpha e_k^{p/n} \tag{3.17}$$

where $\alpha$ is a constant and $e_j$ is defined as $\|x_j - x_*\|$.


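Proof. Since $\nabla^2 f(x_*)$ is positive definite, there exist $K_1$, $\beta_1 > 0$ and $\beta_2 > 0$ such that

$$\beta_1 [f(x_k) - f(x_*)]^{1/2} \le \|x_k - x_*\| \le \beta_2 [f(x_k) - f(x_*)]^{1/2} \tag{3.18}$$

for all $k \ge K_1$. Therefore, since we have a descent method, for all $l > k \ge K_1$, $\|x_l - x_*\| \le \frac{\beta_2}{\beta_1} \|x_k - x_*\|$. Now, given $k \ge K_1$ we apply Lemma 3.2 to the set $\{s_k, s_{k+1}, \ldots, s_{k+n}\}$. Thus, there exists $l_1 \in \{k+1, \ldots, k+n\}$ such that

$$\frac{\|(B_{l_1} - \nabla^2 f(x_*)) s_{l_1}\|}{\|s_{l_1}\|} < \bar{c} \left( \frac{\beta_2}{\beta_1} e_k \right)^{1/n}, \tag{3.19}$$

where $\bar{c}$ is the constant from Lemma 3.2. (If there is more than one such index $l_1$, we choose the smallest.) Equation (3.19) implies that for $\|x_{l_1} - x_*\|$ sufficiently small, by Theorem 6.4 of Dennis and Moré [1977], Algorithm 2.1 will choose $\lambda_{l_1} = 1$ so that $x_{l_1+1} = x_{l_1} + s_{l_1}$. This fact, together with Lemma 3.3 and (3.19), implies that if $e_k$ is sufficiently small then

$$e_{l_1+1} \le \bar{\alpha}\, e_k^{1/n} e_{l_1} \tag{3.20}$$

for some constant $\bar{\alpha}$. Now we can apply Lemma 3.2 to the set

$$\{s_k, s_{k+1}, \ldots, s_{k+n}, s_{k+n+1}\} - \{s_{l_1}\}$$

to get $l_2$. Repeating this $p - n$ times we get a set of integers $l_1 < l_2 < \cdots < l_{p-n}$, with $l_1 > k$ and $l_{p-n} < k + p$ such that

$$e_{l_i+1} \le \bar{\alpha}\, e_k^{1/n} e_{l_i} \tag{3.21}$$

for each $l_i$. Now letting $h_j = [f(x_j) - f(x_*)]^{1/2}$, since we have a descent method,

$$h_{j+1} \le h_j, \tag{3.22}$$

and using (3.18) we have that for our arbitrary $k \ge K_1$,

$$h_{l_i+1} \le \frac{1}{\beta_1} e_{l_i+1} \le \frac{\bar{\alpha}}{\beta_1} e_k^{1/n} e_{l_i} \le \frac{\bar{\alpha} \beta_2}{\beta_1} e_k^{1/n} h_{l_i} \tag{3.23}$$

for $i = 1, 2, \ldots, p - n$. Therefore using (3.22) and (3.23) we have that

$$h_{k+p} \le \left( \frac{\bar{\alpha} \beta_2}{\beta_1} e_k^{1/n} \right)^{p-n} h_k$$

which, by (3.18), implies that

$$e_{k+p} \le \frac{\beta_2}{\beta_1} \left( \frac{\bar{\alpha} \beta_2}{\beta_1} e_k^{1/n} \right)^{p-n} e_k.$$

Therefore,

$$e_{k+p} \le \bar{\alpha}^{\,p-n} \left( \frac{\beta_2}{\beta_1} \right)^{p-n+1} e_k^{p/n},$$

and (3.17) follows. □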
Corollary 3.1 Under the assumptions of Theorem 3.1 the sequence $\{x_k\}$ generated by Algorithm 2.1 is $(n+1)$-step q-superlinear, i.e.,

$$\lim_{k \to \infty} \frac{e_{k+n+1}}{e_k} = 0,$$

and is $2n$-step q-quadratic, i.e.,

$$\limsup_{k \to \infty} \frac{e_{k+2n}}{e_k^2} < \infty.$$

Proof. Let $p = n + 1$, and $p = 2n$ in Theorem 3.1. □

Note that a $2n$-step q-quadratically convergent sequence has an r-order of $\sqrt[2n]{2}$. Since the integer $p$ in the theorem is arbitrary, an interesting, purely theoretical question is what value of $p$ will prove the highest r-convergence order for the sequence. It is not hard to show that, by choosing $p$ to be an integer close to $en$, the r-order approaches $e^{1/(en)} \approx 1.44^{1/n}$ for $n$ sufficiently large, and that this value is optimal for this technique of analysis.
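The remark about the best choice of $p$ can be checked numerically. The sketch below assumes that a bound of the form $e_{k+p} \le \alpha e_k^{p/n}$ yields an r-order of $(p/n)^{1/p}$; the helper names `r_order` and `best_p` are illustrative and not part of the paper.

```python
import math


def r_order(p: int, n: int) -> float:
    """R-order implied by a p-step bound with exponent p/n, i.e. (p/n)**(1/p)."""
    return (p / n) ** (1.0 / p)


def best_p(n: int, factor: int = 10) -> int:
    """Integer p in (n, factor*n] that maximizes r_order(p, n) by direct search."""
    return max(range(n + 1, factor * n + 1), key=lambda p: r_order(p, n))


if __name__ == "__main__":
    n = 50
    p = best_p(n)
    print("best p:", p, " e*n:", math.e * n)
    print("r-order at best p:", r_order(p, n))
    print("limit e^(1/(e n)):", math.exp(1.0 / (math.e * n)))
    print("r-order at p = 2n:", r_order(2 * n, n))
```

For moderate $n$ the maximizing integer sits next to $en$, and the resulting order is slightly larger than the $\sqrt[2n]{2}$ obtained from $p = 2n$.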

4. Positive Definiteness of the SR1 Update

One of the requirements in Theorem 3.1 for the rate of convergence to be $p$-step q-superlinear is that the sequence $\{B_k\}$ generated by the SR1 method be positive definite. Actually, the proof of Theorem 3.1 only requires positive definiteness of $B_k$ at the $p - n$ out of $p$ "good iterations." In this section, we present computational results to confirm that in practice, the SR1 method generally satisfies this requirement.

In Table A8 in the appendix, in the fourth column, we report for each iteration whether $B_k$ is positive definite or not. The fifth column reports the percentage of iterates at which the SR1 update is positive definite, and the sixth column contains the largest number $j$ for which the last $j$ Hessian approximations, ending with $B_l$, are all positive definite, where $B_l$ is the Hessian approximation at the final iterate. The results of Table A8 are summarized in Table 4, which indicates that the SR1 formula was positive definite at least 70% of the time on every one of our test problems. In light of this, and since Theorem 3.1 really only requires positive definiteness at the "good steps" (at other steps all that is needed is that $f$ be reduced), chances that superlinear steps will be taken at least every $n$ steps by the algorithm seem good. Another way of viewing this is that we know from Theorem 3.1 that out of every $2n$ steps, at least $n$ will be "good steps" so long as $B_k$ is positive definite at these iterations. Thus, if for example $B_k$ is positive definite at 80% of these $2n$ steps, at least 30% of the $2n$ iterates must be "good steps."
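The counting behind the 30% figure can be spelled out in a few lines. This is a minimal sketch under a worst-case assumption (every iteration at which $B_k$ is not positive definite is counted against the good steps); the function name `guaranteed_good_fraction` is hypothetical.

```python
def guaranteed_good_fraction(pd_fraction: float, n: int, p: int) -> float:
    """Worst-case fraction of p steps that are 'good' and have B_k positive definite.

    Out of p steps at most n fail to be good steps, and at most
    (1 - pd_fraction) * p iterations have an indefinite B_k.
    """
    not_pd = (1.0 - pd_fraction) * p
    good_and_pd = p - n - not_pd
    return max(good_and_pd, 0.0) / p


if __name__ == "__main__":
    n = 10
    print(guaranteed_good_fraction(0.8, n, 2 * n))
```

With $p = 2n$ and 80% positive definiteness this gives $(2n - n - 0.4n)/(2n) = 0.3$, matching the figure quoted above.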


Table 7: Percentage of Iterations with $B_k$ Positive Definite

    %           < 70    [70,80)    [80,90)    [90,100)    100
    problems       0          5         12           6      5

We also tested the denominator condition that

$$|s_k^T (y_k - B_k s_k)| \ge r \|s_k\| \|y_k - B_k s_k\| \tag{4.1}$$

where $r = 10^{-8}$, using standard initial points. The last column in Table A8, which reports the number of times this condition was violated, indicates that this condition is rarely violated in practice. This finding is consistent with the results of Conn, Gould, and Toint [1988b].
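As an illustration of how condition (4.1) is typically used, the sketch below applies the SR1 formula $B_+ = B + vv^T / (v^T s)$ with $v = y - Bs$ and skips the update when the test fails. It is a plain-Python sketch on nested lists, not the implementation used for the experiments; the name `sr1_update` and its return convention are assumptions.

```python
import math


def sr1_update(B, s, y, r=1e-8):
    """Return (B_new, updated) for the SR1 update, skipping it if test (4.1) fails."""
    n = len(s)
    # v = y - B s
    v = [y[i] - sum(B[i][j] * s[j] for j in range(n)) for i in range(n)]
    denom = sum(v[i] * s[i] for i in range(n))  # s^T (y - B s)
    norm_s = math.sqrt(sum(si * si for si in s))
    norm_v = math.sqrt(sum(vi * vi for vi in v))
    if denom == 0.0 or abs(denom) < r * norm_s * norm_v:
        return [row[:] for row in B], False  # update skipped
    B_new = [[B[i][j] + v[i] * v[j] / denom for j in range(n)] for i in range(n)]
    return B_new, True


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    print(sr1_update(identity, [1.0, 0.0], [2.0, 0.0]))  # update applied
    print(sr1_update(identity, [1.0, 0.0], [1.0, 1.0]))  # v is orthogonal to s: skipped
```

When the update is applied, the new matrix satisfies the secant equation $B_+ s = y$; when $y - Bs$ is orthogonal to $s$ the denominator vanishes and the matrix is left unchanged.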

Finally we present an example showing that it is possible for $B_k$ to fail to be positive definite at every iteration of a line search SR1 algorithm, and for the iterates to converge only linearly to the minimizer $x_*$. This shows that the assumptions of Theorem 3.1 cannot be guaranteed to hold. We then consider the same example in a trust region SR1 algorithm, and show that it does not suffer from the same problems. This leads us to feel that it may not be necessary to assume $\{B_k\}$ positive definite in order to prove superlinear convergence for a trust region SR1 method.


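Example 4.1 Let

$$f(x) = \tfrac{1}{2} x^T x, \qquad x_0 = \begin{pmatrix} \sigma \\ 0 \end{pmatrix}, \qquad \text{and} \qquad B_0 = \begin{pmatrix} 1 & 0 \\ 0 & \sigma \end{pmatrix},$$

where $\sigma < 0$. At the first iteration, the algorithm will compute

$$x_1 = x_0 - \begin{pmatrix} 1 + \delta_0 & 0 \\ 0 & \sigma + \delta_0 \end{pmatrix}^{-1} \nabla f(x_0) = \frac{\delta_0}{1 + \delta_0} x_0$$

for some $\delta_0 > -\sigma$, and accept this point as the next iterate. The SR1 update will produce $y_0 - B_0 s_0 = 0$, so that $B_1 = B_0$. The remaining iterates proceed analogously, so that for each $k$, $B_k = B_0$ and

$$x_{k+1} = \frac{\delta_k}{1 + \delta_k} x_k$$

for some $\delta_k > -\sigma$, meaning that the rate of convergence is no better than linear with constant $\frac{|\sigma|}{1 + |\sigma|}$.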
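The behavior described in Example 4.1 can be reproduced with a short simulation. The sketch below makes several assumptions that are not in the paper: the shift $\delta_k$ is held fixed at $-\sigma + 0.5$, the starting point is $(1, 0)^T$, and the update is skipped whenever $y - Bs$ vanishes. The names `sr1_update_2x2` and `run_line_search_example` are illustrative.

```python
def sr1_update_2x2(B, s, y, r=1e-8):
    """SR1 update of a 2x2 matrix; returns B unchanged if the denominator test fails."""
    v = [y[i] - (B[i][0] * s[0] + B[i][1] * s[1]) for i in range(2)]  # y - B s
    denom = v[0] * s[0] + v[1] * s[1]
    norm_s = (s[0] ** 2 + s[1] ** 2) ** 0.5
    norm_v = (v[0] ** 2 + v[1] ** 2) ** 0.5
    if norm_v == 0.0 or abs(denom) < r * norm_s * norm_v:
        return [row[:] for row in B]
    return [[B[i][j] + v[i] * v[j] / denom for j in range(2)] for i in range(2)]


def run_line_search_example(sigma=-1.0, shift=0.5, x0=(1.0, 0.0), iters=6):
    """Iterate x+ = x - (B + delta I)^{-1} grad f(x) for f(x) = 0.5 x^T x."""
    B = [[1.0, 0.0], [0.0, sigma]]
    delta = -sigma + shift  # assumed fixed shift with delta > -sigma
    x = list(x0)
    history = [x[:]]
    for _ in range(iters):
        g = x[:]  # gradient of 0.5 x^T x is x
        a, b = B[0][0] + delta, B[0][1]
        c, d = B[1][0], B[1][1] + delta
        det = a * d - b * c
        # Solve (B + delta I) s = -g for the 2x2 system
        s = [-(d * g[0] - b * g[1]) / det, -(a * g[1] - c * g[0]) / det]
        x = [x[0] + s[0], x[1] + s[1]]
        y = s[:]  # grad f(x + s) - grad f(x) = s for this f
        B = sr1_update_2x2(B, s, y)
        history.append(x[:])
    return B, history


if __name__ == "__main__":
    B_final, hist = run_line_search_example()
    print("final B:", B_final)
    print("contraction ratios:", [hist[k + 1][0] / hist[k][0] for k in range(len(hist) - 1)])
```

With $\sigma = -1$ and $\delta = 1.5$ the matrix never changes and each iterate is $\delta/(1+\delta) = 0.6$ times the previous one, so the convergence is only linear.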

It is interesting to consider the behavior on the same problem of a trust region SR1 algorithm that exactly solves the problem

$$\min_{s \in R^n} \nabla f(x_k)^T s + \tfrac{1}{2} s^T B_k s \quad \text{subject to} \quad \|s\| \le \Delta_k \tag{4.2}$$

at each iteration. If there exists $\mu_0$ such that $B_0 + \mu_0 I$ is positive definite and $\|(B_0 + \mu_0 I)^{-1} \nabla f(x_0)\| = \Delta_0$, then as in the line search method, $x_1 = \frac{\mu_0}{1 + \mu_0} x_0$ and $B_1 = B_0$. Since $\text{ared}_0 = \text{pred}_0$, the trust region radius is not decreased. Thus eventually at some iterate $k$, we must have $\|(B_k + \mu_k I)^{-1} \nabla f(x_k)\| < \Delta_k$ for all $\mu_k > -\lambda_k$, where $\lambda_k < 0$ is the smallest eigenvalue of $B_k$. In this case the solution to (4.2) is the step

$$x_{k+1} = x_k - (B_k - \lambda_k I)^{+} \nabla f(x_k) - \nu e_2$$

for a $\nu \ne 0$ that makes $\|s_k\| = \Delta_k$. (Here $e_2 = (0, 1)^T$ is the eigenvector of $B_k$ corresponding to the negative eigenvalue.) It is then straightforward to verify that $y_k - B_k s_k = \nu(\sigma - 1) e_2$, $B_{k+1} = I = \nabla^2 f(x)$ and $x_{k+2} = x_*$.

A practical trust region algorithm will not solve (4.2) exactly, but any algorithm that deals with the "hard case" (when $\|(B_k - \lambda_k I)^{+} \nabla f(x_k)\| < \Delta_k$) well, such as algorithms of Moré and Sorensen [1982], will have the same effect. That is, at some point it will set

$$x_{k+1} = x_k - (B_k + \mu_k I)^{-1} \nabla f(x_k) - v_k$$

where $v_k$ is a negative curvature direction for $B_k$. This implies that $v_k^T e_2 \ne 0$, which in turn leads to $B_{k+1} = I$ and $x_{k+2} = x_*$. Thus the trust region method has the ability, for this example, to correct negative eigenvalues in the Hessian approximation. This indicates that it may be possible to establish superlinear convergence of a trust region SR1 algorithm without assuming a priori either strong linear independence of the iterates or positive definiteness of $\{B_k\}$. This issue is currently under investigation.
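The "straightforward to verify" claim for the hard-case step can be checked numerically. The sketch below assumes $\sigma = -1$, a current iterate $(1, 0)^T$ and a radius $\Delta = 2$ (all illustrative choices), builds the step $s = -(B - \sigma I)^{+} \nabla f(x) - \nu e_2$ with $\|s\| = \Delta$, applies the SR1 update, and then takes one full step with the updated matrix. The helper names are hypothetical.

```python
import math


def hard_case_step(x1, sigma, radius):
    """Step for B = diag(1, sigma), gradient (x1, 0): pseudo-inverse part plus nu * e_2."""
    s1 = -x1 / (1.0 - sigma)  # -(B - sigma I)^+ g has only a first component
    if radius <= abs(s1):
        raise ValueError("radius too small: not in the hard case")
    nu = math.sqrt(radius ** 2 - s1 ** 2)  # chosen so that ||s|| = radius
    return [s1, -nu]


def sr1_update_2x2(B, s, y):
    """Plain SR1 update of a 2x2 matrix (no skipping; the denominator is nonzero here)."""
    v = [y[i] - (B[i][0] * s[0] + B[i][1] * s[1]) for i in range(2)]
    denom = v[0] * s[0] + v[1] * s[1]
    return [[B[i][j] + v[i] * v[j] / denom for j in range(2)] for i in range(2)]


def demo(sigma=-1.0, x1=1.0, radius=2.0):
    B = [[1.0, 0.0], [0.0, sigma]]
    x = [x1, 0.0]
    s = hard_case_step(x1, sigma, radius)
    x_next = [x[0] + s[0], x[1] + s[1]]
    y = s[:]  # gradient difference equals s for f(x) = 0.5 x^T x
    B_next = sr1_update_2x2(B, s, y)
    # Full quasi-Newton step with the updated matrix (assumed to fit in the trust region)
    a, b, c, d = B_next[0][0], B_next[0][1], B_next[1][0], B_next[1][1]
    det = a * d - b * c
    g = x_next[:]
    s2 = [-(d * g[0] - b * g[1]) / det, -(a * g[1] - c * g[0]) / det]
    x_final = [x_next[0] + s2[0], x_next[1] + s2[1]]
    return B_next, x_final


if __name__ == "__main__":
    print(demo())
```

Because the step has a nonzero component along $e_2$, the rank-one correction replaces the negative eigenvalue $\sigma$ by 1, and the following step lands on the minimizer.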

5.  Conclusions  and  Future  Research 

We have attempted, in this paper, to investigate theoretical and numerical aspects of quasi-Newton methods that are based on the SR1 formula for the Hessian approximation. We considered both line search and trust region algorithms.

We tested the SR1 method on a fairly large number of standard test problems from Moré, Garbow, and Hillstrom [1981], and Conn, Gould, and Toint [1988b]. Our test results show that on the set of problems we tried, the SR1 method, on the average, requires somewhat fewer iterations and function evaluations than the BFGS method in both line search and trust region algorithms. Although there is no result for the BFGS method concerning the convergence of the sequence of approximating matrices to the correct Hessian like the one given by Conn, Gould, and Toint [1991] for the SR1, numerical tests do not show that the SR1 method is more accurate than the BFGS method in this regard. One reason for this, as indicated by our numerical experiments, is that the requirement of uniform linear independence that is needed by the theory of Conn, Gould, and Toint [1991] often fails to be satisfied in practice.

Under conditions that do not assume uniform linear independence of the generated steps, but do assume positive definiteness and boundedness of the Hessian approximations, we were able to prove $(n+1)$-step q-superlinear convergence, and $2n$-step q-quadratic convergence, of a line search SR1 method. We also gave numerical evidence that the SR1 update is positive definite most of the time, and that one of the potential problems of the formula, that of the denominator being zero, is rarely encountered in practice.

An interesting topic for future research that was mentioned in Section 4 is the convergence analysis of a trust region SR1 method, again without the assumption of uniform linear independence of steps. It is possible that the assumption of the positive definiteness of the Hessian approximations, which we showed is necessary and sufficient to prove superlinear convergence in the line search SR1 method, may not be necessary to prove superlinear convergence for a properly chosen trust region SR1 algorithm.
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Appendix 


Table A1: List of Test Function Numbers and Names.

    Number   Dimension   Name
    MGH05        2       Beale Function.
    MGH07        3       Helical Valley Function.
    MGH09        3       Gaussian Function.
    MGH12        3       Box 3-Dimensional Function.
    MGH14        4       Wood Function.
    MGH16        4       Brown and Dennis Function.
    MGH18        6       Biggs Exp6 Function.
    MGH20        9       Watson Function.
    MGH21       10       Extended Rosenbrock Function.
    MGH22        8       Extended Powell Singular Function.
    MGH23       10       Penalty Function I.
    MGH24       10       Penalty Function II.
    MGH25       10       Variably Dimensioned Function.
    MGH26       10       Trigonometric Function.
    MGH35        9       Chebyquad Function.
    CGT01        8       Generalized Rosenbrock Function.
    CGT02       25       Chained Rosenbrock Function.
    CGT04       20       Generalized Singular Function.
    CGT05       20       Chained Singular Function.
    CGT07        8       Generalized Wood Function.
    CGT08        8       Chained Wood Function.
    CGT10       30       A Generalized Broyden Tridiagonal Function.
    CGT11       30       Another Generalized Broyden Tridiagonal Function.
    CGT12       30       Generalized Broyden Banded Function.
    CGT14       30       Toint's 7-diagonal generalization of Broyden Tridiagonal Function (see Toint 1978).
    CGT16       10       Trigonometric Function (Toint, 1978).
    CGT17        8       A Generalized Cragg and Levy Function.
    CGT21       20       A Generalized Brown Function.

MGH: problems from Moré, Garbow, and Hillstrom [1981].
CGT: problems from Conn, Gould, and Toint [1987].
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Table  A2:  Iterations  and  Function  Evaluations  -  Line  Search 


Function 

!  n 

BFGS 

SRI 

sp 

itrn. 

f-eval 

rgx 

itrn. 

f-eval 

rgx 

MGH05 

2 

16 

58 

0.7E-06 

14 

52 

0.1E-05 

1 

MGH07 

3 

26 

141 

0.4E-05 

30 

142 

0.4E-06 

1 

MGH09 

3 

5 

34 

0.3E-05 

3 

26 

0.2E-07 

1 

MGH12 

3 

35 

157 

0.5E-06 

21 

99 

0.6E-06 

1 

MGH14 

4 

32 

186 

0.7E-05 

26 

160 

0.5E-05 

1 

MC.H16 

4 

31 

183 

0.1E-05 

21 

133 

0.3E-07 

1 

MGH18 

6 

43 

336 

0.2E-05 

37 

302 

O.6E-O0 

1 

MGH20 

9 

95 

46 

532 

0.8E-05 

1 

MGH21 

10 

34 

461 

0.9E-05 

34 

462 

0.3E-05 

1 

MGII22 

8 

45 

464 

0.7E-05 

36 

382 

0.4E-05 

1 

MGH23 

10 

135 

204 

Egg 

0.6E-05 

1 

MGII24 

10 

25 

358 

0.7E-05 

25 

362 

0.8E-05 

1 

MGH25 

10 

16 

259 

0.7E-06 

16 

259 

0.7E-06 

1 

MGII26 

10 

27 

374 

0.3E-05 

27 

375 

0.2E-05 

1 

MGH35 

9 

25 

320 

0.2E-05 

25 

320 

0.3E-06 

1 

MGH05 

2 

47 

154 

0.3E-07 

41 

1.39 

MG  HOT 

3 

29 

136 

0.6E-06 

38 

175 

10 

MGH09 

3 

20 

98 

0.1E-05 

17 

102 

0.3E-06 

10 

MGH12 

3 

66 

286 

0.5E-05 

55 

259 

0.5E-05 

10 

MGH14 

m 

58 

316 

0.6E-05 

69 

379 

MG  II 16 

D 

59 

322 

0.3E-05 

37 

212 

0.1E-05 

10 

MG  II 18 

B 

■3 

361 

0.3E-05 

46 

369 

0.1E-05 

10 

MG  1120 

9 

95 

1020 

0.2E-05 

46 

532 

0.8E-05 

10 

MGH21 

10 

57 

775 

0.3E-05 

60 

813 

0.4E-07 

10 

MG  1122 

8 

88 

977 

0.9E-05 

67 

793 

0.3E-05 

10 
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Table  A2  (continued) 


Function 

n 

BFGS 

SRI 

sp 

itrn. 

f-eval 

rgx 

rgx 

MGH23 

10 

177 

2080 

0.9E-05 

192 

MGH25 

10 

41 

535 

0.3E-05 

23 

0.3E-05 

10 

MC.H26 

10 

72 

0.7E-05 

mm 

560 

0.9E-06 

10 

MGH07 

3 

31 

0.4E-06 

—an 

113 

mm 

100 

MGII14 

n 

ms 

0.5E-06 

104 

567 

0.5E-05 

100 

MGH16 

4 

B3MI 

472 

0.2E-05 

55 

303 

0.3E-06 

100 

95 

1020 

0.2E-05 

0.8E-05 

100 

MGU21 

L 10 

158 

2185 

0.SE-05 

154 

1227 

0.4  E- 05 

90 

875 

0.9E-05 

100 

MGH25 

10 

472 

5276 

0.1E-04 

335 

3769 

0.1E-04 

100 

CGI  01 

8 

71 

707 

0.5E-05 

81 

843 

0.4E-06 

1 

25 

36 

1315 

0.7E-05 

43 

1505 

1 

2049 

0.9E-05 

49 

1291 

0.5E-05 

1 

CGT05 

20 

311 

6797 

0.8E-05 

180 

4055 

0.9E-05 

1 

CGT07 

8 

116 

1132 

1 

CGT08 

8 

141 

1348 

0.5E-05 

1 

CGT10 

30 

58 

2328 

0.9E-05 

1 

CGT11 

30 

37 

1686 

0.3E-05 

1 

1526 

0.8E-05 

1 

CGT12 

30 

264 

8734 

0.6E-05 

1 

KftSiW 

30 

100 

3640 

0.9E-05 

1 

11 

203 

0.4E-05 

11 

204 

0.2E-05 

1 

l! 

1269 

92 

892 

0.3E-05 

1 

CGT21 

504 

0.2E-05 

11 

483 

0.3E-09 

1 

1 


Table  A3:  Iterations  and  Function  Evaluations  -  Trust  Region 


Function 

n 

Bl'GS 

SRI 

sp 

it  rn. 

f-eval 

rgx 

itrn. 

f-eval 

rgx 

MG  1105 

2 

15 

57 

0.3E-06 

16 

68 

0.5E-05 

1 

MGII07 

3 

27 

133 

29 

150 

n 

MGH09 

3 

5 

3 

31 

0.2E-07 

n 

MG1I12 

D 

1 

26 

146 

0.SE-05 

i 

MG  II 14 

n 

oa 

0.4E-07 

34 

247 

0.5E-05 

i 

MG  11 10 

m 

oa 

0.1E-05 

mm. 

138 

0.7E-05 

i 

MG  If  I  S 

G 

gp 

mm 

mm 

0.8E-05 

i 

MG  1120 

9 

B 

m 

46 

584 

0.3E-05 

i 

MG1I21 

10 

aa 

msm 

mm 

49 

671 

0.2E-06 

i 

MG  II 22 

8 

•a 

e m 

26 

294 

0.SE-05 

i 

MG  112  1 

10 

21 

344 

0.2  E-  05 

24 

357 

0.8E-05 

i 

j  MG  II 25 

10 

14 

236 

0.6E-05 

14 

236 

0.6E-05 

i 

|  MG  11 20 

10 

27 

373 

0.2E-05 

24 

349 

0.1E-05 

i 

MG II 35  j  9 

24 

308 

0.4E-05 

21 

285 

0.3  E- 05 

i 

ilHHHifcMSM 

■a 

160 

0.9E-05 

36 

147 

0.9E-06 

10 

MGII07 

3 

29 

141 

0.1E-05 

0.4E-05 

10 

MG  1109 

3 

21 

112 

84 

0.9E-05 

10 

MG  II 12 

3 

62 

292 

0.9E-06 

19 

0.7E-05 

10 

|  MG II 11  j  l 

$2 

443 

0.6  E- 06 

71 

467 

0.SE-06 

10 

MG II 10  j  1 

59 

32  1 

0.5E-00 

35 

0.8E-07 

10 

|  MG II  is 

6 

39 

323 

51 

437 

0.6E-07 

10 

MGH20  1  9 

SS 

957 

0.3E-05 

46 

F  584 

0.3  E- 05 

10 

MG  1121  i  10 

03 

78.s 

0.3E-05  j  58 

800 

0.2E-05 

10 

MG  1122  |  m 

94 

913 

0.5 E- 05  56 

575 

0.SE-05 

10 

21 


Table  A3  (continued) 


Function 

n 

BFGS 

SRI 

sp 

itrn. 

f-eval 

rgx 

itrn. 

f-eval 

rgx 

MG  1123 

10 

22 

337 

0.4H-05 

113 

1335 

0.SE-05 

10 

MG  II 24 

10 

224 

2609 

0.1E-04 

253 

3140 

0.1E-04 

10 

MGH25 

10 

36 

488 

0.7E-05 

mm 

371 

0.3E-05 

10 

MGH26 

10 

87 

1040 

0.7E-05 

mm 

650 

MG  HOT 

3 

L_H_ 

158 

0.2E-05 

118 

0.2E-05 

100 

MG  HI  4 

4 

85 

471 

0.1E-05 

69 

426 

0.3E-05 

100 

MGH16 

4 

89 

472 

0.4E-06 

52 

311 

0.1E-04 

100 

|  MGH20 

9 

88 

957 

0.3E-05 

46 

584 

0.3E-05 

100 

|  MGH21 

10 

165 

1941 

0.2E-05 

149 

2139 

0.3E-06 

100 

MGH22 

$ 

116 

1127 

80 

840 

0.2E-05 

100 

CGT01 

3 

58 

584 

0.7E-05 

80 

84S 

0.8E-05 

1 

CGT02 

25 

45 

1550 

0.4E-05 

46 

1597 

1 

CGTOI 

20 

110 

2579 

0.3E-05 

89 

2195 

0.5E-05 

1 

CGT05 

20 

323 

156 

3645 

O.SE-05 

1 

CGT07 

S 

123 

1190 

0.4E-05 

139 

1429 

1 

CGTOS 

,s 

130 

1255 

0.9E-05 

146 

1524 

0.5E-05 

1 

CGT10 

30 

58 

2326 

0.9E-05 

mm 

1832 

1 

CGT11 

30  |  35 

1619 

0.3E-05 

mm 

CGT12  |  30 

mm 

2454 

0.8E-05 

KSB 

1916 

0.5E-05 

1 

£ 

Da 

0.4  E- 05 

ii 

206 

1 

CGT17 

3 

S3  |  818 

0.9E-05 

74 

802 

0.8E-05 

1 

CGT21  |  20 

12 

0.3E-09 

1 

Table A4: Testing Convergence of $\{B_k\}$ to $\nabla^2 f(x_*)$ - Line Search


Function 

n 

BFGS 

SRI 

itr 

itr 

11/7,-5,11/11^11 

2 

19 

0.458E-04 

16 

0.686E-05 

MGH07 

3 

28 

0.274E-04 

33 

0.175E-06 

MGH09 

3 

9 

0.918E+00 

4 

0.918E+00 

MG  1112 

3 

38 

0.545E-04 

24 

0.147E-03 

MG  H 1-1 

4 

K1 

0.830E-02 

29 

0.154E-04 

MGH16 

-1 

mm 

0.928E-01 

23 

0.348E-04 

mmmm 

6 

mm 

0.234E+01 

40 

0.234E+01 

9 

1175 

0.105E+00 

100 

0.264E-02 

mm 

0.804E-01 

34 

0.645E-01 

MGH22 

8 

E9 

0.161E+01 

49 

0.160E+ 01 

MGH23 

10 

178 

0.167E+04 

215 

0.167E+04 

MG  1124 

10 

348 

0.177E-01 

330 

0.140E-03 

MGH25 

10 

16 

0.748E+04 

16 

0.74SE+04 

MGH26 

0.689E-01 

31 

0.468E-01 

MGH35 

9 

28 

0.834E+00 

26 

0.833E+00 

CGT01 

8 

73 

0.393E-01 

83 

CGT02 

25 

43 

0.570E-01 

50 

0.317E-01 

CGT04 

20 

500 

0.133E+04 

500 

0.133E+ 04 

CGT05 

0.582E+03 

500 

0.503E+03 

CGT07 

8 

138 

0.691E-01 

124 

0.111E-01 

CGTOS 

8 

147 

0.425E-01 

146 

0.492E-02 

CGT10 

0.134E+03 

8-1 

0.185E+03 

CGT11 

30 

44 

0.781E-01 

37 

0.448E-01 

CGT12 

0.3S4E+00 

210 

0.691E-01 

CGT14 

30 

86 

0.279E+00 

107 

0.303E+00 

CGT16 

10 

18 

0.466E-01 

16 

0.3S5E-03 

CGT17 

8 

216 

0.462E+00 

125 

0.566E-01 

CGT21 

20 

16 

0.124E+01 

12 

0.120E+01 

20 


Table A5: Testing Convergence of $\{B_k\}$ to $\nabla^2 f(x_*)$ - Trust Region


BFGS  SRI 

itr 

ii n,  -  m/\\n,\\ 

itr 

II  Ht-BtW/WHtW 

MG  1105 

2 

17 

0.235E-02 

18 

0.102E-05 

MGH07 

3 

30 

0.400E-02 

31 

0.172E-05 

MGH09 

3 

9 

0.918E+00 

4 

0.918E+00 

MGHl‘2 

3 

36 

0.396E-02 

30 

0.473E-02 

4 

Ea 

0.216E-02 

41 

0.290E-05 

MGH16 

4 

36 

0.809E-01 

mm 

0.369E-04 

47 

0.234E+01 

in 

0.234E+01 

MGH20 

9 

157 

0.261E-01 

0.176E-02 

MGH21 

10 

Ea 

0.999E+00 

51 

0.999E+00 

MGH22 

8 

77 

0.277E+01 

43 

0.276E+01 

MGH23 

149 

0.218E+04 

MGH24 

0.391E-02 

202 

0.173E+02 

MGH25 

10 

15 

0.103E+05 

15 

0.103E+05 

10 

31 

0.906E-01 

28 

0.234E-01 

MGH35 

9 

28 

0.880E+00 

23 

■he 

CGT01 

8 

61 

0.110E+00 

81 

■EEQESlH 

CGT02 

25 

51 

0.22SE+00 

50 

0.107E+00 

CGT04 

0.314E+04 

500 

0.248E+04 

■ttetnusf 

20 

500 

0.104E+04 

500 

0.671E+03 

CGT07 

8 

122 

0.354E-01 

138 

0.579E-02 

CGT03 

8 

138 

0.532E-01 

139 

0.405E-04 

CGT10 

30 

115 

0.109E+03 

82 

0.112E+03 

CGT11 

30 

40 

0.982E-01 

34 

0.690E-01 

CGT12 

30 

97 

0.770E+03 

66 

0.756E+03 

CGT14 

30 

46 

0.220E+00 

40 

0.160E-01 

CGT16 

10 

16 

0.523E-01 

15 

0.298E-02 

CGT17 

8 

200 

0.250E-i-00 

123 

0.1 17E-01 

CGT21 

20 

16 

0.124E+01 

12 

0.120E+01 

Table A6: Testing Uniform Linear Independence of $\{s_k\}$ - Line Search


f(x) 

No.  of  steps  so  that  > 

ESI 

OBI 

FTittl 

StiOM 

WiBM 

KEU 

10“' 

KEB9 

MGH05 

2 

16 

3 

2 

2 

2 

2 

2 

2 

2 

MG  HOT 

3 

33 

4 

3 

3 

3 

3 

3 

3 

3 

MGH09 

3 

4 

* 

* 

* 

* 

* 

* 

* 

* 

24 

14 

5 

3 

3 

3 

3 

3 

3 

29 

10 

5 

5 

4 

4 

4 

4 

4 

MGH16 

4 

23 

6 

4 

4 

4 

4 

4 

4 

4 

40 

* 

* 

* 

* 

* 

* 

* 

* 

MGH20 

9 

100 

74 

70 

67 

64 

63 

62 

61 

60 

MGH21 

10 

34 

* 

* 

* 

* 

* 

* 

* 

* 

S 

49 

* 

* 

* 

* 

* 

* 

* 

* 

215 

77 

77 

my  mm 

1 1 

77 

77 

77 

mm 

77 

MGH24 

10 

330 

79 

79 

79 

79 

79 

79 

79 

79 

16 

* 

* 

* 

* 

* 

* 

* 

★ 

31 

30 

16 

10 

10 

10 

10 

10 

10 

MGH35 

9 

26 

* 

* 

* 

* 

* 

* 

* 

* 

CGT01 

8 

83 

26 

15 

13 

13 

13 

13 

13 

13 

CGT02 

25 

50 

47 

28 

25 

25 

25 

25 

mm 

25 

87 

87 

87 

87 

87 

87 

87 

87 

CGT05 

20 

500 

87 

mm 

87 

87 

87 

87 

87 

87 

CGT07 

8 

124 

76 

76 

76 

42 

34 

34 

34 

34 

CGT08 

8 

146 

mm 

45 

45 

45 

45 

45 

45 

45 

CGT10 

30 

84 

* 

* 

60 

34 

30 

30 

30 

30 

CGT11 

30 

37 

35 

33 

30 

30 

30 

30 

30 

30 

CGT12 

98 

98 

88 

88 

88 

88 

88 

88 

CGT14 

30 

107 

59 

36 

36 

36 

36 

36 

36 

36 

11 

10 

10 

10 

10 

10 

10 

10 

CGT17 

8 

125 

67 

42 

34 

34 

34 

34 

34 

12 

* 

* 

* 

* 

4 

* 

* 

* 

* $S_m = [\, s_l/\|s_l\|,\ s_{l-1}/\|s_{l-1}\|,\ \ldots,\ s_{l-m}/\|s_{l-m}\| \,]$, where $m > n$.
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Table  AT:  Testing  Uniform  Linear  Independence  of  { s t}  -  Trust  Region 


n 

Itr. 

No.  of  steps  so  that  crmtn(5m)*  > 

WiBl 

rnnaiM 

WiBl 

Q2I 

KHfil 

WiBl 

10“‘ 

IE2I 

MGH05 

2 

18 

3 

2 

2 

2 

2 

2 

2 

2 

MG  1107 

3 

31 

5 

3 

3 

3 

3 

3 

3 

3 

MGH09 

3 

4 

* 

* 

* 

* 

* 

* 

* 

* 

MGH12 

3 

30 

J 

6 

5 

3 

3 

3 

3 

3 

MGH14 

4 

41 

8 

5 

4 

4 

4 

4 

4 

4 

MGH16 

4 

22 

5 

4 

4 

4 

4 

4 

4 

4 

MGH18 

6 

40 

* 

* 

* 

* 

* 

* 

* 

* 

MGIT20 

9 

99 

75 

64 

63 

62 

62 

61 

61 

61 

MGH‘21 

10 

51 

* 

* 

* 

* 

* 

* 

* 

* 

MG  II 22 

8 

43 

* 

* 

* 

* 

* 

* 

* 

* 

MG  1123 

10 

149 

77 

77 

77 

77 

77 

77 

77 

77 

MG  PI  2-1 

10 

202 

79 

79 

79 

74 

74 

74 

74 

74 

MGH25 

10 

15 

* 

* 

* 

* 

* 

* 

* 

* 

MGH26 

10 

28 

26 

18 

10 

10 

10 

10 

10 

10 

MGH35 

9 

23 

* 

* 

* 

* 

* 

* 

* 

* 

CGT01 

8 

81 

32 

17 

13 

12 

12 

12 

12 

12 

CGT02 

25 

50 

* 

29 

26 

25 

25 

25 

25 

25 

CGT04 

88 

88 

88 

88 

88 

88 

88 

88 

CGT05 

20 

500 

88 

87 

87 

87 

87 

87 

87 

87 

CGT07 

8 

138 

76 

76 

50 

43 

41 

41 

41 

41 

CGT08 

8 

139 

41 

41 

41 

41 

41 

41 

41 

41 

CGT10 

30 

82 

* 

* 

59 

36 

32 

30 

30 

30 

CGT11 

34 

* 

31 

30 

30 

30 

30 

30 

30 

CGTI2 

30 

66 

* 

* 

It 

60 

40 

31 

30 

30 

CGT14 

* 

33 

30 

30 

30 

30 

30 

30 

CGTI6 

10 

15 

12 

10 

10 

10 

10 

10 

10 

10 

CGT17 

8 

123 

73 

49 

39 

34 

33 

33 

33 

33 

CGT21 

20 

12 

_ ! _ 

* 

* 

* 

* 

* 

* 

* 

* $S_m = [\, s_l/\|s_l\|,\ s_{l-1}/\|s_{l-1}\|,\ \ldots,\ s_{l-m}/\|s_{l-m}\| \,]$, where $m > n$.
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Table  A8:  Testing  Positive  Definiteness  -  Line  Search 


/(*) 

n 

Itr. 

0:  Indefinite  ;  1:  Positive  Definite 

%pd 

1* 

2* 

MGH05 

2 

14 

1111111111111 

1.00 

13 

1 

MGH07 

3 

30 

11111101111011110111111111111 

0.90 

12 

1 

MGH09 

3 

3 

11 

1.00 

2 

1 

MGH12 

3 

21 

11111111111111111111 

1.00 

20 

1 

MGH14 

4 

26 

1111111101111110111110111 

0.88 

3 

1 

MGH16 

4 

21 

10111111111111111111 

0.95 

18 

1 

MGH18 

6 

37 

111111100111111111111111011111111111 

0.92 

11 

1 

MGH20 

9 

46 

1111011111111111011111011101101111110 

11111011 

0.84 

2 

1 

MGH21 

10 

34 

111011111110111101001111111111111 

0.85 

13 

1 

MGH22 

8 

36 

11111101011111111111111110111111111 

0.91 

9 

1 

MG  1123 

10 

204 

111111111111111111101111111111101111 

111011101101101001101001111011110111 

111111011010001111100111111101110011 

111101011111101111010100110101111110 

111101101111111010011011101111011001 

11111011111101111110111 

0.77 

3 

0 

MGH24 

10 

25 

111111101110111110111111 

0.88 

6 

1 

MGH25 

10 

16 

111111111111111 

1.00 

15 

0 

MGH26 

10 

27 

11101110111011101101110111 

0.77 

3 

1 

MGH35 

9 

25 

111110110111110111111111 

0.88 

9 

1 

CGT01 

8 

81 

111111110011010011110101101111110100 

110111111011011101100110111011111011 

11111111 

0.75 

10 

1 

CGT02 

25 

43 

111111110011111110011011011011011111 

linn 

0.81 

11 

1 

CGT04 

20 

49 

111111111101111111011111101111111111 

111111111111 

0.94 

22 

1 

1*: Number of consecutive iterations where $B_k$ was positive definite immediately prior to the termination of the algorithm.

2*: Number of iterations where the SR1 update is skipped because condition (4.1) was violated.
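The %pd and 1* columns of Table A8 can be recomputed from the 0/1 strings in the fourth column. The sketch below is only a reading aid; the function name `summarize_pd_flags` is hypothetical, and the sample string is the MGH07 row as printed above.

```python
def summarize_pd_flags(flags: str):
    """Return (fraction of '1' entries, length of the trailing run of '1's)."""
    fraction_pd = flags.count("1") / len(flags)
    trailing_run = len(flags) - len(flags.rstrip("1"))
    return fraction_pd, trailing_run


if __name__ == "__main__":
    # MGH07 row of Table A8, written in pieces to make the zeros easy to see
    mgh07 = "111111" + "0" + "1111" + "0" + "1111" + "0" + "1" * 12
    frac, run = summarize_pd_flags(mgh07)
    print(round(frac, 2), run)
```

For this row the string has 29 entries with three zeros, which reproduces the tabulated values 0.90 and 12.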
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Table  A8  (continued) 


/(*) 

n 

Itr 

0:  Indefinite  ;  1:  Positive  Definite 

%pd 

1* 

2* 

CGT05 

20 

180 

111111111011111011111111111101110111 

111111111111111010111101111111110111 

111111110111011010001110111111101111 

111111111010111111011011111001110111 

11111111111110111111111111111111111 

0.87 

21 

1 

CGT07 

8 

116 

111111111111111110111111101000011011 

010010011111101011010011011101111011 

011111111111111101111011110110111111 

1111111 

0.78 

13 

1 

CGT08 

8 

140 

111111110110111110111011111101101101 

111110011011111101101110011011110100 

110110000000011110111111001110100111 

1110110011010011011111010111111 

0.70 

6 

1 

CGT10 

30 

40 

111111111111111111111111101111111111 

001 

0.92 

1 

1 

CGT11 

30 

32 

1111011101111111110111011111111 

0.87 

8 

1 

CGT12 

30 

199 

111111111110111111110110111101111111 

111110110111110111011101110111110111 

011111111110111011111101100111111010 

110011111111111010101101111111101011 

101111110011111011111110110011011111 

110101011101111101 

0.80 

1 

1 

CGT14 

30 

100 

111010111110111011101110011110110111 

111111101110111101101111111010101111 

111111111111110111111111111 

0.83 

12 

1 

CGT16 

10 

11 

1111111111 

1.00 

10 

1 

CGT17 

8 

92 

111111011111111101111110111101101111 

011111100111111111101111101111111101 

1111111110111111111 

0.87 

9 

1 

CGT21 

20 

11 

1110101111 

0.80 

4 

1 

1*: Number of consecutive iterations where $B_k$ was positive definite immediately prior to the termination of the algorithm.

2*: Number of iterations where the SR1 update is skipped because condition (4.1) was violated.
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